INTRODUCTION


One  approach  to  the  problems  of  individual  differences 
and  performance  assessment  in  personnel  management  draws  upon 
the theoretical framework of human information processing.
This  approach  focuses  on  the  cognitive  operations  assumed  or 
demonstrated  to  underlie  human  performance.  Other  approaches 
to  the  assessment  of  individual  differences  start  with  a 
specific  criterion  performance  in  a real-world  situation,  then 
develop  an  instrument  that  is  successful  in  predicting  actual 
performance.  Because  an  information  processing  approach  looks 
at  fundamental  parameters  of  behavior  rather  than  actual  job 
requirements,  it  can  be  used  to  understand  as  well  as  measure 
individual  differences.  In  addition,  an  information  processing 
approach  provides  the  potential  for  building  an  assessment 
instrument  that  is  applicable  to  a wide  range  of  criterion  tasks. 

This  approach  was  used  by  Rose  (1974)  in  the  development 
of the Information Processing Performance Battery (IPPB). The
basic  strategy  employed  in  that  research  was  to  select  experi- 
mental tasks  from  the  psychological  literature  that  had  been 
demonstrated  to  be  valid  measures  of  information  processing 
constructs.  Rose  selected  nine  tasks  and  administered  them 
to  large  groups  of  subjects.  Extensive  correlational  analyses 
were  conducted  to  determine  the  degree  of  relationship  among 
the  various  tasks  and  the  reliabilities  of  each.  These  pro- 
cedures resulted  in  identification  of  a set  of  tasks  which 
were  highly  reliable,  statistically  independent,  and  construct 
valid,  and  which  could  be  used  in  assessing  individual  dif- 
ferences in  a wide  variety  of  information  processing  skills. 

This  work  has  been  extended  in  a current  ONR-sponsored 
research  program  being  conducted  at  the  American  Institutes 
for  Research.  The  basic  objective  is  to  further  develop  and 
validate the IPPB so that eventually it can be used as an
assessment device for the evaluation of performance in a wide
variety  of  situations.  The  battery  will  be  designed  to  possess 
high  reliability  and  predictive  validity  for  a wide  variety  of 
criterion  tasks.  The  battery  will  also  include  tests  that 
possess  construct  validity:  there  will  be  a firm  theoretical 
and  empirical  basis  for  inferring  the  information  processing 
structures  and  functions  that  the  tests  purport  to  measure. 

The  activities  contained  in  the  current  phase  of  the 
research  program  are  structured  around  a set  of  laboratory 
experiments.  Each  experiment  addresses  a different  domain  of 
cognitive  processes  as  represented  by  several  carefully 
selected  information  processing  tasks.  The  tasks  are  iden- 
tified primarily  through  an  extensive  review  of  texts, 
articles,  and  abstracts.  Each  task  is  evaluated  on  the  basis 
of  its  logistic  feasibility,  reliability,  and  to  a limited 
degree  its  construct  validity.  Pilot  studies  are  conducted 
to  determine  the  efficiency  of  methodological  refinements  of 
the  selected  tasks.  The  primary  questions  addressed  in  the 
experimental  studies  concern  the  replicability  of  previous 
findings,  the  construct  validity  of  the  tasks,  and  their 
adequacy  in  providing  measures  of  individual  differences. 

The  first  experiment  (Rose  & Fernandes,  1977)  consisted 
of  eight  tasks  selected  from  the  literature  on  memory,  psy- 
cholinguistics, and  visual  information  processing.  These  were 
areas  in  which  a large  body  of  research  had  accumulated  since 
the  IPPB  was  developed.  The  focus  of  the  experiment  was  on 
tasks  that  measured  functional  rather  than  structural  compo- 
nents of  human  information  processing.  The  response  measures 
for  each  task  were  interpreted  as  indicators  of  processing 
rates  and  time  durations  of  stages.  In  order  to  delimit  the 
scope  of  the  tasks,  a post  hoc  organizational  structure  con- 
sisting of  cognitive  operations  was  developed.  All  of  the 
tasks  were  specified  by  some  combination  of  operations.  The 
adequacy  of  the  specifications  was  evaluated  by  comparing  the 
pattern  of  hypothetical  operation  descriptions  with  the 
observed  responses. 


This  paper  describes  the  tasks,  methodology,  and  results 
of  the  second  experiment  carried  out  during  this  research  effort. 
This experiment was designed to investigate other types of information
processing activities that might be included as part of
the  IPPB.  The  focus  in  this  case  was  on  structural  features  of 
the  information  processing  system,  those  that  describe  the 
nature  of  the  information  at  a particular  processing  stage  rather 
than  the  operations  being  performed.  The  six  tasks  in  this 
experiment  were  concerned  with  the  nature  of  memory  representa- 
tion and  provided  measures  of  various  aspects  of  encoding  and 
retrieval  of  previously  stored  information.  This  second 
experiment  was  more  limited  in  scope  than  the  first  study, 
focusing  more  on  the  logistics  of  administering  and  scoring 
the tasks than on reliability and validity issues.

The  tasks  selected  for  pilot  testing  were  chosen  from  among 
the various recognition-type and recall-type tasks presented
by Underwood, Boruch, and Malmi (1977). In their study, it was
assumed  that  when  subjects  were  presented  with  a number  of  words 
to  learn,  they  would  abstract  certain  kinds  of  information  about 
each  word  and  perhaps  about  its  relationships  with  other  words 
in  the  task.  The  different  types  of  information  about  words 
that  get  stored  were  called  "attributes",  and  different  tasks 
were  selected  in  order  to  determine  the  interrelationships  among 
memory  attributes.  Some  of  the  attributes  focused  upon  proper- 
ties of  the  stored  representation,  while  others  were  concerned 
with  how  a new  chunk  of  information  is  integrated  into  previous 
knowledge.  Underwood  et  al.  included  performance  measures 
associated  with  different  tasks  in  a factor  analysis  to  deter- 
mine if  the  attributes  would  form  factors.  They  found  that 
the  factors  that  emerged  were  closely  tied  to  tasks  rather 
than  attributes.  The  authors  felt  that  individual  differences 
in  performance  and  response  patterns  on  some  of  the  tasks 
were  so  strong  as  to  override  any  effect  that  variations  in 
attributes  might  be  having  on  memory. 


Our  approach  is  to  capitalize  upon  and  extend  this  previous 
work  by  exploring  these  tasks  as  potential  measures  of  individual 
differences that could be included in a test battery. "Promising"
tasks are adapted and administered to a group of subjects.
If the results indicate that, where applicable, the major group
effects  are  replicated  in  each  paradigm,  various  task  parameters 
are  identified  and  analyzed. 

The  value  of  the  paradigms  for  an  assessment  battery  depends 
primarily  on  the  measures  derived  from  them  and  the  properties 
of  these  measures  when  considered  as  potential  individual  dif- 
ference variables.  This  distinction  between  task  effects  and 
measurement  properties  is  particularly  important  in  the  present 
context  since  most  of  the  paradigms  were  not  originally  generated 
for  the  study  of  individual  differences;  the  scientists  were 
primarily  concerned  with  uncovering  different  aspects  of  the 
composition  of  the  human  memory  system.  Similarly,  most  of 
these  paradigms  have  not  previously  been  considered  as  tests 
per  se;  no  thought  has  been  given  to  typical  test  development 
issues.  The  distinction  between  group  effects  and  individual 
measures  is  critical  in  that  several  theoretically  independent 
measures  can  be  obtained  from  each  task.  For  example,  the 
Shepard  and  Teghtsoonian  task  results  can  be  described  by  a 
number  of  different  parameters:  the  "standard"  measure  of  pro- 
portion of  correct  items  (or,  more  finely,  proportion  of  "hits" 
and  "false  alarms"),  the  two  parameters  of  the  exponential 
equation  that  is  the  best  fit  to  the  probability-correct-by-lag 
function, and the signal-detection-theory parameters d' and β.

Another  important  consideration  in  the  selection  of  individual 
difference  variables  is  the  susceptibility  and  sensitivity  of 
each  to  strategies  employed  by  the  subjects.  Unless  we  "know" 
what  subjects  are  doing,  several  variables  could  be  interpreted 
as  reflecting  alternative  processes.  In  other  words,  it  would 
be  difficult  to  generate  hypotheses  concerning  underlying 
determinants  of  performance  — the  "construct  validity"  — for 
operations  or  structures  involved  in  each  task.  Selection  of 
variables must also be constrained by the analyses that will
be  conducted  using  those  variables.  It  does  not  suffice  to  say 
that  an  intratask  and  intertask  correlation  matrix  will  be 
generated  in  order  to  examine  the  relationships  among  the 
variables;  the  purpose  for  such  an  examination  must  be  specified. 
In  the  previous  study  in  this  series  (Rose  and  Fernandes,  op. 
cit.),  we  were  able  (at  least  provisionally)  to  hypothesize  the 
underlying  operations  involved  for  each  variable  entered  into 
the  correlation  matrix.  Thus,  the  pattern  of  obtained  correla- 
tions was  in  a sense  a test  of  several  hypotheses.  Furthermore, 
since  all  of  our  variables  were  measured  along  a time  scale, 
relatively  unambiguous  interpretations  of  the  correlations 
could  be  made.  In  the  present  context,  however,  due  to  the 
nature  of  the  tasks  and  the  obtained  measures,  it  simply  is  not 
possible  to  identify  unique  operations  associated  with  the  tasks. 
In many cases, it is a matter of either (a) whether or not a
subject did a particular operation or transform (e.g., whether
or not a subject encoded words visually), or (b) if a subject did
perform an operation, how well it was performed (e.g., if a
subject stored temporal information in the list differentiation
task, how accurately it was stored).

It  must  be  conceded  that  some  of  the  above  considerations 
and  the  judgments  regarding  inclusion  or  exclusion  of  variables 
are  a function  of  the  particular  purposes  and  perspectives  of 
the  authors.  That  is,  although  a paradigm  might  be  "good"  in 
the  sense  of  producing  theoretically  important  phenomena,  we 
might  judge  it  to  be  inappropriate  for  the  generation  of  measures 
of  human  information  processing  to  be  included  in  a test  battery. 
Ideally,  we  would  like  each  measure  to  be  identified  with  a 
single  cognitive  process  (or  a definable  set  of  processes,  each 
of which is potentially isolatable via converging operations).
Thus,  we  have  shied  away  from  variables  which  are  difficult  to 
interpret in terms of our existing set of constructs.

The next sections provide a detailed description of the
conduct  of  the  second  experiment.  First,  the  general  method 
is  described.  This  presentation  is  followed  by  a section  on 
each  of  the  tasks,  including  task  description,  procedure  and 
stimuli,  data  analysis,  and  results. 

General  Method 

The  experiment  was  carried  out  on  AIR  premises,  using  22 
volunteer  staff  members  as  subjects.  These  subjects  were 
unfamiliar  with  the  tasks  and  procedures.  A large  conference 
room  was  arranged  to  accommodate  the  experiment.  A projector 
controlled  by  a peripheral  timer  was  used  to  present  slides 
containing  the  stimuli.  The  timer  could  be  set  for  inter- 
stimulus intervals  ranging  from  500  to  4000  msec.  No  more 
than  eight  subjects  were  tested  at  any  one  time;  seating  was 
arranged  to  minimize  variability  of  viewing  distance  and  angle 
from  the  screen.  Subjects  were  provided  with  instruction 
booklets  and  response  sheets  for  each  task.  Subjects  partici- 
pated in  two  testing  sessions,  two  hours  in  length  and  scheduled 
two  days  apart.  All  of  the  tasks  were  presented  in  each 
session,  in  the  same  order. 

At  the  beginning  of  each  task,  the  experimenter  read  the 
instructions  and  answered  questions.  The  beginning  and  end 
of  each  stimulus  set  were  cued  both  orally  by  the  experimenter 
and  visually  with  a special  slide  on  the  screen.  Durations  of 
stimulus  presentation  and  response  generation  were  fixed,  but 
adequate  time  was  allowed  for  all  subjects  to  view  the  stimuli 
and  record  answers. 

TASK  DESCRIPTIONS  AND  RESULTS 

The six tasks studied in this experiment were: free
recall  (control,  concrete,  and  abstract),  running  recognition, 
interference  susceptibility,  list  differentiation,  situational 
frequency,  and  memory  span.  Descriptions  of  each  of  these 
tasks follow below.

Free Recall

Task  description.  In  the  Underwood  et  al.  study,  free 
recall  was  tested  under  a variety  of  conditions  using  lists  of 
words  that  were  interrelated  in  certain  ways.  A control  con- 
dition was  included  to  provide  a basis  for  comparison  with  the 
concrete  and  abstract  conditions.  This  control  condition  was 
selected  in  the  present  study  as  a measure  of  short-term  memory 
capacity.  Underwood  et  al.'s  condition  comparing  recall  of  con- 
crete and  abstract  words  provided  a measure  of  encoding  by  imagery. 
The  difference  in  recall  for  concrete  and  abstract  words  was 
used  as  an  indicator  of  individual  differences  in  subjects' 
propensity  toward  concrete  memory  representations  as  the  form 
generally  used  in  short-term  memory. 

In the control condition, Underwood et al. gave subjects four
successive  lists  of  24  words  each,  presented  for  a single  study 
and  test  trial.  The  words  were  common  five-letter  words  taken 
from  the  Thorndike-Lorge  tables.  The  number  of  words  correctly 
recalled  was  the  primary  dependent  measure.  Typically,  per- 
formance improved  with  practice.  Serial  position  curves  for 
each  list  and  for  the  four  lists  combined  were  found  to  be 
quite  symmetrical,  indicating  that  primacy  and  recency  effects 
were  essentially  equivalent. 

For  the  concrete-abstract  condition,  Underwood  et  al.  con- 
structed two  24-item  lists  of  concrete  words  and  two  corres- 
ponding lists  of  abstract  words,  all  of  the  lists  matched  for 
Thorndike-Lorge  frequency.  Testing  was  carried  out  under  the 
same  procedure  used  in  the  control  condition.  Recall  on  the 
concrete  lists  was  substantially  better  than  recall  on  the 
abstract  lists;  furthermore,  there  appeared  to  be  consistent 
individual  differences  in  performance  among  subjects. 

Procedure.  Subjects  were  shown  20  words  at  a rate  of  one 
word every 2 seconds. This presentation rate, which was faster
than  the  4-second  rate  used  by  Underwood  et  al.,  was  used  to 
reduce  the  opportunity  for  rehearsal  between  items.  After 
presentation  of  a stimulus  list,  subjects  were  given  one 
minute to write down, in any order, all of the words that they
could  remember.  This  procedure  was  followed  in  both  control 
and  concrete-abstract  conditions.  Below  are  the  events  in  a 
typical trial:


Subject shown word list        Subject responds
sense                          trust
medal                          medal
steed                          steed
trust


Stimuli  and  design.  Underwood  et  al.'s  materials  for  the 
two  conditions  were  used  in  the  first  testing  session,  but  the 
lists  were  cut  from  24  to  20  words.  The  extra  words  from  the 
control lists were compiled into two eight-item practice lists,
one  for  each  of  the  testing  sessions.  Additional  word  lists 
for  the  two  conditions  were  generated  in  the  following  manner: 

A table  of  random  numbers  was  used  to  identify  page  numbers 
within  the  Thorndike-Lorge  tables  from  which  to  make  selections. 
Words  meeting  the  constraints  in  each  condition  were  chosen 
randomly  from  among  the  words  on  these  pages.  The  lists  genera- 
ted in  this  manner  were  quite  similar  in  average  Thorndike-Lorge 
frequency  to  the  lists  developed  by  Underwood  et  al. 

Subjects  completed  one  practice  list,  followed  by  four 
lists  of  control  words,  two  lists  of  concrete  words,  and  two 
lists  of  abstract  words.  The  same  list  order  was  used  in  both 
testing  sessions. 

Data  analysis.  The  dependent  measure  was  the  proportion 
of  words  correctly  recalled  on  each  list.  These  data  were 
used  to  calculate  an  overall  mean  for  each  of  the  three  con- 
ditions. 


Results.  Table  1 summarizes  the  results  for  the  three 
conditions.  The  mean  proportion  correct  in  the  control  condi- 
tion was  .36  on  Day  1 and  .38  on  Day  2 testing.  This  trans- 
lates to  about  7 or  8 words  recalled  from  a 20-word  list,  and 
is  consistent  with  other  research  indicating  short-term  memory 
capacity to be 7 ± 2 items for information presented in various
modalities. 

Mean  proportion  correct  for  concrete  words  was  .43  for 
both  testing  sessions;  for  abstract  words,  the  proportions 
were  .34  and  .37,  respectively.  When  shown  lists  of  concrete 
words,  subjects  were  able  to  recall  an  additional  one  or  two 
words  per  list  compared  to  their  performance  on  lists  of 
abstract  words.  Although  the  improvement  in  recall  was  small, 
it  suggests  that  subjects  tend  to  use  concrete  rather  than 
abstract  representations  in  short-term  memory. 

Test-retest  reliabilities  were  .79,  .79,  and  .72  for  the 
control,  concrete,  and  abstract  conditions,  respectively, 
indicating  that  performance  on  the  task  was  consistent  from 
one  session  to  the  next  for  all  three  conditions.  All  of  the 
measures  showed  nonsignificant  practice  effects.  As  a group, 
subjects  improved  only  slightly  from  one  testing  session  to 
the  next  on  the  control  and  abstract  lists;  performance  on  the 
concrete  lists  was  the  same  on  both  days. 
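
A test-retest reliability of the kind reported above can be obtained by correlating each subject's Day 1 score with the same subject's Day 2 score. The sketch below assumes a Pearson product-moment correlation, which the report does not name explicitly; the function name `pearson_r` is hypothetical and introduced only for illustration.

```python
import math


def pearson_r(day1, day2):
    """Pearson correlation between paired Day 1 and Day 2 scores.

    Assumption: the reliabilities in the text are product-moment
    correlations; the report does not state the coefficient used.
    """
    if len(day1) != len(day2) or len(day1) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(day1)
    mean1 = sum(day1) / n
    mean2 = sum(day2) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean1) * (b - mean2) for a, b in zip(day1, day2))
    sxx = sum((a - mean1) ** 2 for a in day1)
    syy = sum((b - mean2) ** 2 for b in day2)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a session has no variance")
    return sxy / math.sqrt(sxx * syy)
```

Each list would hold one proportion-correct score per subject, in the same subject order for both sessions.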

Although  the  pattern  of  Underwood  et  al.'s  results  was 
replicated  in  the  current  experiment,  overall  performance  levels 
were  consistently  lower  than  those  observed  by  Underwood  et  al. 

In  their  study,  mean  recall  for  control,  concrete,  and  abstract 
lists  was  12,  15,  and  11  words,  respectively.  The  lower  recall 
scores  in  the  present  experiment  can  be  attributed  to  differences 
in  the  educational  backgrounds  of  the  subjects  tested  and  to 
procedural  modifications,  the  most  significant  of  which  involved 
the  presentation  sequence  for  the  three  conditions.  Where 
Underwood  et  al.  administered  the  control  lists  and  the  concrete 
and  abstract  lists  on  different  days,  the  current  experiment 
presented the three list types in both testing sessions, in
the same order each session. Practice on earlier lists may
have  improved  recall  on  the  abstract  lists,  hence  reducing  the 
magnitude  of  the  differences  between  concrete  and  abstract  words 
in the present study. Underwood et al.'s presentation rate
was also slower, perhaps enabling the subjects to rehearse
the  words  more  thoroughly. 

Running  Recognition 

Task  description.  Underwood  et  al.  used  a variation  of 
the paradigm developed by Shepard and Teghtsoonian (1961) to
measure certain aspects of memory search and identification.

In  the  original  task,  subjects  were  presented  with  a lengthy 
list  of  three-digit  numbers  and  were  asked  to  identify  each 
number  as  "old"  (i.e.,  previously  presented)  or  "new".  The 
lists  were  constructed  so  that  the  intralist  intervals  between 
the  original  and  test  presentations  of  items  varied.  A reten- 
tion function  for  a single  item  was  inferred  by  plotting  prob- 
ability of  recognition  as  a function  of  test  lag.  In  addition, 
estimates  of  several  signal  detection  parameters  were  generated 
for  each  subject.  In  the  Underwood  et  al.  study,  the  stimulus 
items were words rather than numbers, and the lists were constructed
to evaluate the importance of the acoustic attribute
in  memory  for  words.  The  pattern  of  hits,  misses,  and  false 
alarms  obtained  by  Underwood  et  al.  did  not  support  their 
hypotheses  about  this  attribute. 

The  Shepard  and  Teghtsoonian  paradigm  was  one  of  the  tasks 
included  in  the  first  AIR  experiment  (Rose  & Fernandes,  1977). 

The  paradigm  was  selected  for  that  study  because  it  provided 
a measure  of  memory  capacity  when  the  possibility  of  rehearsal 
is  minimized  and  the  interference  of  preceding  material  is 
maximized.  The  results  obtained  from  the  paradigm  paralleled 
those  of  Shepard  and  Teghtsoonian.  The  adaptation  of  the  task 
by  Underwood  et  al.  was  of  interest  in  the  present  study  because  it 
provided an opportunity to determine if a change in stimulus
mode  would  provide  results  similar  to  the  pattern  obtained  in 
the  first  AIR  experiment. 


Procedure.  In  this  task,  subjects  decided  whether  or  not 
they  remembered  having  seen  each  word  earlier  in  the  series. 

If  they  had  seen  the  word  previously  in  the  list,  they  were  to 
respond  "old";  if  they  had  not  seen  the  word  before,  they  were 
to respond "new". Subjects had 3 seconds in which to make
their  judgment  and  to  respond  before  the  next  word  appeared. 
Below  are  the  events  in  a typical  trial: 

Subject views word        Subject responds
mail                      "new"
shone                     "new"
  .                         .
  .                         .
  .                         .
mail                      "old"

Stimuli  and  design.  Underwood  et  al.'s  lists  were  used  to 
identify  two  sets  of  51  words  which  appeared  in  two  lists  in 
the  present  experiment.  The  stimulus  series  developed  by  Rose 
and  Fernandes  (1977)  was  used  to  determine  the  word  order  in 
each  list.  Each  number  in  the  series  was  replaced  by  one  of 
the  words.  Thus,  the  word  lists  in  the  current  study  possessed 
the  same  characteristics  as  the  number  lists  in  the  earlier 
study.  The  lists  were  101  words  in  length.  With  a single 
exception,  every  word  in  a given  list  appeared  exactly  twice. 
The  second  presentations  of  the  words  were  placed  so  that 
several  lags  between  first  occurrence  and  subsequent  test  were 
represented. The lags used were 1, 2, 4, 8, 12, 16, 20, 24,
30, and 36 items, with five exemplars of each lag in a given
list.  Two  lists  were  created,  one  for  each  testing  session. 


Data analysis. Two types of measures were used:
"traditional" retention parameters and parameters derived from
signal detection theory. Of the traditional measures, the two
employed here were proportion correct (i.e., x/101) and a
two-parameter estimate of the best-fitting curve for the
probability-correct-by-lag function. This function was characterized by
the least squares estimate for A and B in the equation:

$y = Ax^{B}$, where A = intercept and B = exponent.

These  parameters  were  calculated  for  each  subject  for  each 
testing  session. 
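
A two-parameter lag function of this kind can be estimated per subject with a short least-squares routine. The sketch below assumes the power form y = A * x**B (x = lag, y = probability correct) and assumes the fit is done on log-transformed values, which turns the problem into a straight-line regression; the report does not say whether its estimates were obtained this way. The name `fit_power_function` is hypothetical.

```python
import math

# Lags used in the running recognition lists, as listed in the text.
LAGS = [1, 2, 4, 8, 12, 16, 20, 24, 30, 36]


def fit_power_function(lags, proportions):
    """Return (A, B) for y = A * x**B fitted by least squares in log-log space.

    Assumption: fitting log(y) = log(A) + B * log(x); the original analysis
    may have used a different least-squares procedure.
    """
    if len(lags) != len(proportions) or len(lags) < 2:
        raise ValueError("need at least two (lag, proportion) pairs")
    if any(x <= 0 for x in lags) or any(y <= 0 for y in proportions):
        raise ValueError("logarithms require strictly positive values")
    log_x = [math.log(x) for x in lags]
    log_y = [math.log(y) for y in proportions]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    if sxx == 0:
        raise ValueError("all lags are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    exponent = sxy / sxx                                # B
    intercept = math.exp(mean_y - exponent * mean_x)    # A = value of y at lag 1
    return intercept, exponent
```

A subject with a proportion correct of zero at some lag cannot be handled by the log transform, so such cases would need separate treatment.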

The  signal  detection  parameters  were  derived  from  two 
observed  scores,  namely: 

(a) the probability that the subject responded "old"
when the stimulus was old [p("O"/O)], or "hits"; and

(b) the probability that the subject responded "old"
when the stimulus was new [p("O"/N)], or "false alarms."

The signal detection discrimination parameter d' was calculated
as the normal deviate of (a) plus the normal deviate of (b).
Beta (β) was calculated as the normal ordinate of (a) divided
by the normal ordinate of (b).
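
The two signal detection parameters can be computed directly from a subject's hit and false-alarm rates. The sketch below uses the conventional signed z-score formulation, d' = z(hits) - z(false alarms); when the hit rate is above .5 and the false-alarm rate is below .5, this equals the sum of the two unsigned normal deviates described above. That equivalence, and the function name `signal_detection`, are assumptions made for illustration rather than details taken from the report.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def signal_detection(hit_rate, false_alarm_rate):
    """Return (d_prime, beta) from hit and false-alarm probabilities.

    Assumption: signed z-scores, so d' = z(hit) - z(false alarm), and
    beta = ordinate at z(hit) / ordinate at z(false alarm).
    """
    for p in (hit_rate, false_alarm_rate):
        if not 0.0 < p < 1.0:
            # A rate of exactly 0 or 1 has no finite normal deviate.
            raise ValueError("d' and beta are undefined for rates of 0 or 1")
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    d_prime = z_hit - z_fa
    beta = _STD_NORMAL.pdf(z_hit) / _STD_NORMAL.pdf(z_fa)
    return d_prime, beta
```

The explicit error for rates of 0 or 1 mirrors the situation of subjects who make no errors, for whom neither parameter can be computed.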

Results.  Figure  1 summarizes  the  effect  of  delay  on  the 
accuracy  of  classifying  an  old  item  as  "old".  The  functions 
shown  are  for  Day  1 and  Day  2 sessions  in  the  current  study 
and  for  the  combined  data  from  the  Rose  and  Fernandes  study. 
Although  the  curves  are  very  rough,  they  represent  the  course 
of  forgetting  as  delays  get  longer.  In  both  studies,  the 
likelihood  of  recognizing  an  old  item  as  "old"  is  almost 
perfect  when  the  item  was  just  seen.  However,  beyond  this 
point,  the  functions  for  the  two  studies  differ  substantially. 

The  curves  in  the  current  experiment  show  a much  more  gradual 
decline  with  increasing  test  delays.  Even  with  the  maximum 
delay  of  36  items,  the  probability  of  correctly  recognizing 
an  old  word  is  about  .8,  compared  to  a probability  of  .7  when 
the items are numbers. The parameters of the lag functions
in the current experiment, presented in Table 2, were nearly identical
for both testing sessions and hence showed nonsignificant practice
effects. The reliability for each parameter was low and
nonsignificant.

Figure 1. Lag function for the Running Recognition task. Curves: Day 1, current study; Day 2, current study; Rose and Fernandes, 1977. Horizontal axis: delay since first presentation of stimulus (number of intervening presentations).

Overall  performance  on  the  task  was  quite  good.  Subjects 
correctly  recognized  more  than  90  percent  of  the  words  presented 
in both sessions. Their "hit" rate was .90 on Day 1 and .89 on Day 2,
and their "false alarm" rate was .06 and .03,
respectively. All three measures had high test-retest reliability,
and  two  of  the  three  showed  significant  practice  effects.  Recog- 
nition in  the  current  experiment  was  more  accurate  than  recogni- 
tion in  the  Rose  and  Fernandes  study  where  the  "hit"  and  "false 
alarm"  probabilities  were  .75  and  .30. 

A number  of  subjects  in  the  current  experiment  performed 
this task with no errors. Because the signal detection parameters
d' and β are undefined under these conditions, arbitrary
values  of  5 and  20  were  used  in  the  analysis  of  these  subjects' 
data.  As  a result,  interpretation  of  group  means  for  the  two 
parameters  is  not  meaningful,  and  a comparison  between  experi- 
ments is  not  possible.  The  results  for  these  parameters  have 
been  omitted  from  Table  2. 

Interference  Susceptibility 

Task  description.  In  the  Underwood  et  al.  study,  this 
task  provided  a measure  of  individual  differences  in  suscep- 
tibility to  interference  by  associations  established  in  a series 
of  paired-associate  lists.  A list  consisted  of  five  word- 
number  pairs  presented  for  a single  study  and  test  trial.  The 
procedure  within  a set  of  lists  remained  the  same  across  lists; 
the  lists  would  contain  the  same  words  but  they  would  be  paired 
with  the  numbers  in  different  combinations  and  would  be  presen- 
ted in  a different  order.  Subjects  were  presented  with  six 
sets  of  such  lists.  It  was  expected  that  performance  would 
decrease  within  each  set  and  also  decrease  across  sets.  An 
analysis of the number of items correct indicates that performance
decreased across lists within sets as expected; however, a
decrease across sets was not consistently obtained.

Table 2 (Running Recognition task)

                                                                          t-Value
Variable             Session   N    Mean    Std. Dev.   Min.    Max.    Rel.     Day 1 vs 2

Proportion Correct   Day 1     22   .92     .05         .77     .99     .82**    2.17*
                     Day 2     22   .94     .05         .81     .99
Exponent             Day 1     22   -.044   .07         -.24    .09     .37      .14
                     Day 2     22   -.042   .07         -.22    .06
Intercept            Day 1     22   .98     .10         .69     1.12    .23      -.12
                     Day 2     22   .977    .10         .79     1.21
P("hits")            Day 1     22   .90     .08         .62     1.00    .77**    -.60
                     Day 2     22   .89     .10         .62     1.00
P("false alarms")    Day 1     22   .06     .06         .00     .20     .60**    -2.43*
                     Day 2     22   .03     .04         .00     .20

This  task  is  similar  to  running  recognition  in  that  both 
create  conditions  where  the  possibility  of  rehearsal  is  mini- 
mized and  the  interference  of  preceding  material  is  maximized. 
Because interference susceptibility measures recall rather
than recognition, this task focuses on the storage and retrieval
of  information  rather  than  on  storage  alone. 

Procedure.  Subjects  were  shown  five  word-number  pairs 
at  a rate  of  one  pair  every  3 seconds.  A special  slide  then 
appeared  which  cued  the  end  of  the  list  and  the  beginning  of 
the  recall  task.  Each  word  was  shown  by  itself  (in  a different 
order  from  that  used  in  the  study  presentation)  for  4 seconds, 
and  subjects  recalled  the  number  with  which  it  had  been  paired. 
Below is an example of a typical trial:


Subject studies list    Subject shown probe    Subject responds
NOB-5                   RAP                    2
HEW-4                   PEG                    3
JIG-1                   NOB                    5
PEG-3                   JIG                    1
RAP-2                   HEW                    4

Stimuli  and  design.  In  each  session,  there  were  six 
sets  of  lists,  each  set  containing  four  lists.  Each  list 
within a set used the same group of three-letter words. The
words  in  the  first  list  of  a set  were  paired  with  a number 
ranging  from  1 to  5.  In  the  three  remaining  lists  of  the  set, 
the  words  were  paired  with  different  combinations  of  the  same 
numbers.  Underwood  et  al.'s  materials  were  used  in  the  first 
testing  session.  Additional  words  for  Day  2 were  generated 
from  the  Thorndike-Lorge  tables  in  the  manner  described  for 
free  recall  and  were  checked  for  comparability  in  terms  of 
word frequency with the Underwood et al. stimuli.

Data analysis. The dependent measure on this task was the
proportion  of  items  correct  per  list.  These  data  were  used  to 
calculate  the  means  for  successive  lists  within  sets  and  an 
overall  mean  for  each  of  the  six  sets.  In  addition,  the  slope 
of  the  best  fitting  line  relating  proportion  correct  as  a func- 
tion of  list  number  within  sets  was  calculated  for  each  subject. 
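
The slope calculation described above can be sketched as an ordinary least-squares fit of proportion correct on list number. This is only an illustration: the function name and the subject's scores are hypothetical, and the report does not say which fitting routine was actually used.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical subject: proportion correct on lists 1-4,
# averaged over the six sets (values invented for illustration).
list_numbers = [1, 2, 3, 4]
proportion_correct = [0.70, 0.64, 0.62, 0.58]

slope = least_squares_slope(list_numbers, proportion_correct)
print(round(slope, 3))
```

A negative slope indicates that accuracy falls across successive lists in a set, which is the pattern interpreted as interference.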

Results.  Figure  2 indicates  performance  on  successive 
lists,  collapsed  across  sets,  for  the  two  testing  sessions  and 
for  the  Underwood  et  al.  study.  With  the  exception  of  the 
final  list  in  the  second  testing  session,  the  proportion  correct 
decreased  with  successive  lists,  indicating  that  associations 
formed  in  early  lists  interfere  with  recall  in  later  lists. 

The  slope  of  the  linear  function,  shown  in  Table  3,  was  -.03 
and  -.02  for  the  two  testing  sessions.  These  figures  compare 
favorably  with  the  slope  of  -.03  for  the  Underwood  et  al.  study 
when  their  data  were  re-analyzed  in  terms  of  proportion  correct. 
In  both  studies,  recall  decreased  by  nearly  half  a word  from 
the  first  list  to  the  fourth  list  in  a set. 
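
As a rough check on the "nearly half a word" figure, the slope can be converted into items lost between the first and fourth lists. The sketch below assumes that the slope applies evenly over the three list-to-list steps and that each list has the five pairs given in the procedure; the function name is hypothetical.

```python
ITEMS_PER_LIST = 5   # five word-number pairs per list (from the procedure)
LIST_STEPS = 3       # list 1 to list 4 spans three steps

def item_drop(slope_per_list):
    """Items lost from list 1 to list 4 for a given slope in proportion correct."""
    return -slope_per_list * LIST_STEPS * ITEMS_PER_LIST

drop_slope_03 = item_drop(-0.03)   # slope reported for Day 1 and for Underwood et al.
drop_slope_02 = item_drop(-0.02)   # slope reported for Day 2
print(round(drop_slope_03, 2), round(drop_slope_02, 2))
```

Under these assumptions a slope of -.03 corresponds to a loss of about .45 items, and a slope of -.02 to about .30 items.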

Although  the  pattern  of  results  was  similar  in  the  two 
studies,  subjects  in  the  current  experiment  performed  the  task 
less  accurately  than  subjects  in  the  Underwood  et  al.  study. 

The  overall  proportion  correct  (shown  in  Table  3)  was  .61  for 
Day  1 and  .69  for  Day  2.  The  improvement  from  one  session  to 
the  next  was  significant,  but  performance  was  still  well  below 
the value of .85 obtained by Underwood et al. The test-retest
correlation for overall proportion correct was high (r = .77),
indicating that performance was generally consistent from one
session to the next.

Several  reliability  estimates  described  by  Underwood  et  al. 
were  calculated  in  the  present  experiment.  The  correlation  of 
proportion  correct  for  sets  1,  3,  and  5 with  sets  2,  4,  and  6 
was  .87  for  both  days;  the  correlation  for  the  sum  of  lists  1 
and  3 across  sets  with  the  sum  of  lists  2 and  4 was  .85  for 
Day  1 and  .89  for  Day  2.  The  two  sets  of  figures  compared 
favorably  with  those  of  Underwood  et  al.  who  obtained  the  same 
value  of  .81  for  both  calculations. 
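
The first of these reliability estimates, the correlation of sets 1, 3, and 5 with sets 2, 4, and 6, might be computed along the following lines. The subject scores are invented for illustration and the helper names are hypothetical; only the odd-versus-even split is taken from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def odd_even_halves(set_scores):
    """Mean proportion correct on sets 1, 3, 5 and on sets 2, 4, 6."""
    odd = set_scores[0::2]
    even = set_scores[1::2]
    return sum(odd) / len(odd), sum(even) / len(even)

# Hypothetical data: one row per subject, proportion correct on sets 1-6.
subjects = [
    [0.50, 0.60, 0.55, 0.65, 0.60, 0.70],
    [0.80, 0.85, 0.90, 0.80, 0.95, 0.90],
    [0.30, 0.40, 0.35, 0.45, 0.40, 0.35],
    [0.65, 0.70, 0.60, 0.75, 0.70, 0.80],
]
halves = [odd_even_halves(row) for row in subjects]
split_half_r = pearson_r([h[0] for h in halves], [h[1] for h in halves])
print(round(split_half_r, 2))
```

The second estimate (lists 1 and 3 against lists 2 and 4) would follow the same pattern with the split taken over lists rather than sets.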

Figure 2. Mean proportion correct for successive lists within sets for the
Interference Susceptibility task.

Table 3. Descriptive Measures for the
Interference Susceptibility Task

                                   Std.                          t-Value
Variable               N    Mean   Dev.    Min.    Max.   Rel.   Day 1 vs 2

Proportion Correct/
Total
  Day 1               22    .61    .19     .29     .95    .77**   2.68*
  Day 2               22    .69    .19     .34     .98

List Function/
Slope
  Day 1               22   -.03    .01    -.16     .66    .05      .52
  Day 2               22   -.02    .04    -.10     .53

 *p < .05
**p < .01


Figure 3 shows the proportion correct on each of the sets
for the two testing sessions. Although performance improves
substantially from set 1 to set 2, there is considerable
variability in performance on the remaining sets. Subjects
generally become more accurate, scoring more items correctly
on set 6 than on set 1. Overall performance is better on Day
2 than Day 1, but this, too, is not consistent across the sets.

Some  subjects  in  the  current  study  reported  trying  out 
different  strategies  for  dealing  with  the  task.  Once  a 
successful  strategy  was  found,  these  subjects  made  very  few 
errors  on  subsequent  lists.  This  would  account  in  part  for 
the  irregular  effects  of  practice  across  the  six  sets  of  lists. 
It would also explain why some subjects obtained positive
rather than negative slopes (see Table 3), indicating a
facilitation rather than an interference effect. The low reliability
for  the  slope  measure  probably  reflects  the  subjects'  efforts 
to  find  a successful  strategy  since  test-retest  correlations 
are  sensitive  to  changes  of  this  sort. 

Given that running recognition and interference susceptibility
provide estimates of the same construct, it was expected
that performance on the two tasks would be correlated. Total
proportion correct on interference susceptibility and proportion
correct, hits, and false alarms on running recognition were
selected for analysis because they were measures with high
reliabilities in the current experiment. The correlations
between the interference susceptibility measure and each of
the running recognition measures were .57, .41, and -.35 for
Day 1 and .49, .29, and -.32 for Day 2. All of these figures
were in the direction expected, but only one of them was
significant (r = .57, p < .01).

Situational  Frequency 

Task description. This task was selected in the present
experiment to provide a measure of the "frequency" attribute
of a memory representation, i.e., the ability to determine
the number of times that a piece of information occurs. In
the Underwood et al. study, subjects were shown a list of words,
then judged the frequency of occurrence of each of the words.
The primary measure was the correlation between true
frequencies and judged frequencies for the words.

Procedure.  Subjects  were  shown  a list  of  92  words  at  a 
rate  of  2 seconds  per  word.  At  the  end  of  the  list,  they  were 
given  a response  form  containing  all  of  the  words  from  the  list 
plus  some  that  had  not  been  presented.  Subjects  had  4 minutes 
during  which  to  judge  the  actual  frequency  of  occurrence  of 
each word. The events in a typical trial are shown below:


Subject shown list:    elfin, starlight, limbo, artful
Subject shown word:    starlight, quibble, elfin, artful
Subject responds:      elfin, limbo

Stimuli  and  design.  Underwood  et  al.'s  word  lists  were 
used in both sessions of the current study. Each list contained
92 words (40 different words in 92 presentations): 12 words
presented once, 12 presented twice, 12 presented three times,
and four presented five times. The words
were  all  of  two  syllables  and  had  Thorndike-Lorge  frequencies 
falling  between  1 and  10.  For  the  current  experiment,  the 
twelve  words  that  appeared  in  the  test  lists  but  not  in  the 
initial  stimulus  lists  were  selected  according  to  the  procedure 
described  for  the  free  recall  task.  They  were  comparable  to 
the  words  used  by  Underwood  et  al.  in  terms  of  Thorndike-Lorge 
frequencies.  The  response  form  contained  the  40  words  from 
the  list  and  the  12  additional  words,  presented  in  randomized 
order.  Two  sets  of  lists  and  response  forms  were  prepared, 
one  for  each  testing  session. 
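
The list composition described above can be checked with a few lines of arithmetic. The variable names are hypothetical; the counts are those given in the text.

```python
# Presentation frequency -> number of words shown at that frequency.
words_per_frequency = {1: 12, 2: 12, 3: 12, 5: 4}
NEW_WORDS_ON_FORM = 12   # words on the response form that were never presented

distinct_words = sum(words_per_frequency.values())
total_presentations = sum(freq * n for freq, n in words_per_frequency.items())
response_form_items = distinct_words + NEW_WORDS_ON_FORM

print(distinct_words, total_presentations, response_form_items)
```

The 92 "words" in a list are therefore 92 presentations of 40 different words, and each response form carries 52 items, 12 of which have a true frequency of zero.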

Data analysis. The dependent measure was the mean judged
frequency for each actual frequency level (0, 1, 2, 3, and
5). The judged frequencies were correlated with actual
frequencies, and the slope of the best fitting linear function
relating these data was calculated for each subject.
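
A minimal sketch of these three per-subject quantities follows. The judgments are invented, the function names are hypothetical, and the sketch assumes that the correlation and slope are computed over individual items; the report does not say whether item-level judgments or level means were used.

```python
from math import sqrt

def mean_judged_by_level(pairs):
    """Mean judged frequency at each actual frequency level."""
    grouped = {}
    for actual, judged in pairs:
        grouped.setdefault(actual, []).append(judged)
    return {level: sum(v) / len(v) for level, v in sorted(grouped.items())}

def correlation_and_slope(pairs):
    """Pearson r and least-squares slope of judged on actual frequency."""
    xs = [a for a, _ in pairs]
    ys = [j for _, j in pairs]
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx

# Hypothetical subject: (actual frequency, judged frequency) for ten items.
judgments = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2),
             (2, 2), (3, 2), (3, 3), (5, 4), (5, 5)]

level_means = mean_judged_by_level(judgments)
r, slope = correlation_and_slope(judgments)
print(level_means, round(r, 2), round(slope, 2))
```

In this invented example the slope is below 1: the subject overestimates the low frequencies and underestimates the high ones.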

Results.  Figure  4 shows  the  relationship  between  actual 
and  judged  frequencies  in  the  current  experiment.  Subjects 
tend  to  overestimate  actual  frequencies  of  0 and  1,  but  to 
underestimate  actual  frequencies  greater  than  1.  This  pattern 
is  more  pronounced  on  Day  2 than  on  Day  1.  The  correlation 
between  the  frequencies,  shown  in  Table  4,  was  .82  for  the 
list  used  on  Day  1 and  .80  for  the  list  used  on  Day  2.  These 
figures  compare  favorably  with  the  correlations  of  .87  and  .85 
reported  by  Underwood  et  al.  for  the  two  lists.  The  slope  of 
the  line  relating  actual  and  judged  frequencies,  also  shown 
in  Table  4,  can  be  used  to  estimate  the  accuracy  of  subjects' 
frequency  judgments.  The  slope  was  .89  on  Day  1 and  .90  on 
Day  2,  indicating  that  subjects  in  the  current  experiment 
erred  on  the  low  side  in  their  judgments. 

The  reliability  of  the  correlation  and  slope  measures  was 
.69  and  .82,  respectively.  The  effect  of  practice  on  the  two 
measures  was  small  and  nonsignificant. 

List  Differentiation 

Task  description.  This  task  focused  on  the  "temporal" 
attribute  of  a memory  representation,  i.e.,  the  ability  to 
order  incoming  information  on  the  time  dimension.  Underwood 
et  al.  presented  three  successive  lists  of  words  to  subjects, 
then  asked  them  to  indicate  in  which  list  each  word  had  occurred. 
The  primary  response  measure  was  the  number  of  errors  per  list. 

Procedure.  Subjects  were  shown  three  successive  lists  of 
20  four-letter  words  each  at  a rate  of  one  word  every  2 seconds. 
Subjects  were  cued  orally  and  with  a special  slide  when  each 
list  ended  and  the  next  one  began.  At  the  end  of  the  third 
list, subjects were given a response sheet containing the 60
words presented in a randomized order. They were given 3
minutes during which to indicate the list (1, 2, or 3) in
which each word had appeared. The events in a typical trial
are shown below.



Stimuli and design. Two sets of words, each set containing
three lists, were presented in each testing session. Underwood
et al.'s materials were used in the first session. The lists
for the second session were generated from the Thorndike-Lorge
tables in the manner described for free recall and were checked
for comparability in terms of word frequency with the Underwood
et al. stimuli.



Data analysis. The dependent measure was the proportion
of items correct on each list. These data were used to
calculate the means for each set of lists.
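
The scoring just described could be sketched as follows. The function name and the six-word example are hypothetical; an actual set contained 60 words, 20 from each list.

```python
def proportion_correct_by_list(true_lists, responses):
    """Proportion of words assigned to the correct list, per true list."""
    correct = {}
    total = {}
    for truth, response in zip(true_lists, responses):
        total[truth] = total.get(truth, 0) + 1
        if response == truth:
            correct[truth] = correct.get(truth, 0) + 1
    return {k: correct.get(k, 0) / total[k] for k in sorted(total)}

# Hypothetical miniature example: the list each word came from (1, 2, or 3)
# and the list the subject assigned it to.
true_lists = [1, 1, 2, 2, 3, 3]
responses = [1, 1, 2, 3, 1, 3]

by_list = proportion_correct_by_list(true_lists, responses)
set_mean = sum(by_list.values()) / len(by_list)
print(by_list, round(set_mean, 2))
```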

Results.  Figure  5 represents  performance  on  the  three 
lists  for  the  two  testing  sessions  and  for  the  Underwood  et  al. 
study.  In  both  experiments,  the  proportion  correct  decreases  with 
successive  lists,  indicating  that  subjects'  judgments  were 
more  accurate  for  words  presented  in  earlier  rather  than  more 
recent  lists.  Recall  dropped  by  about  half  a word  with  each 
successive  list. 

Table 5 summarizes performance on the two sets of lists
for the two testing sessions. The mean proportion correct for
set 1 was .45 on Day 1 and .51 on Day 2; for set 2, the
proportion was .49 and .55 for the two sessions. This translates
into an improvement in judgment of about one word per set from
one session to the next. This improvement was significant
for both sets. Test-retest reliabilities were .83 and .70,
indicating that both measures were highly reliable. The
subjects in the current experiment performed the task with
increasing accuracy so that by the final set of lists, performance was
comparable to the level reported by Underwood et al.

Memory  Span 

Task  description.  Underwood  et  al.  used  a memory  span  for 
letters  task  to  provide  a measure  of  the  "acoustic"  attribute 
in  memory.  One  set  of  strings  contained  letters  with  high 
acoustic  similarity  (e.g.,  B,  C,  E,  G)  and  one  set  with  low 
acoustic similarity (e.g., J, L, R, W). Subjects were presented
with  strings  of  6,  7,  8,  and  9 letters,  low  similarity  strings 
first,  followed  by  high  similarity  strings.  Each  string  was 
scored  in  terms  of  the  number  of  letters  correct.  As  expected, 
the  percent  correct  letters  decreased  considerably  as  string 
length  increased.  Although  the  difference  between  high  and 
low  similarity  strings  was  reliable,  the  magnitude  of  the 
effect  was  less  than  expected. 

Figure 5. Mean proportion correct for successive lists on the List
Differentiation task.


This  task  was  similar  to  the  control  condition  in  free 
recall  in  that  both  provide  a measure  of  short-term  memory 
capacity.  In  addition,  the  difference  between  high  and  low 
similarity  strings  was  used  as  an  indicator  of  individual 
differences  in  subjects'  reliance  on  acoustic  representations 
in  memory. 

Procedure.  Subjects  were  presented  with  a letter  string, 
at  a rate  of  one  second  per  letter.  They  were  then  given  10 
seconds  in  which  to  write  down  the  letters  in  the  same  order 
in  which  they  were  presented.  The  answer  sheets  were  marked 
to  indicate  the  number  of  letters  in  each  string.  Subjects 
were  told  to  leave  blank  the  appropriate  spaces  for  letters 
they  could  not  remember.  The  events  in  a typical  trial  are 
shown  below. 

Subject  shown  letters 

Z 
D 
T 
B 
E 
C 

Stimuli and design. Forty-two letter strings were presented
in each testing session; half of the strings contained
letters  with  high  acoustic  similarity  (B,  C,  D,  E,  G,  P,  T,  V, 
and  Z)  and  half  contained  letters  with  low  acoustic  similarity 
(B, H, J, L, 0, K, R, W, and Y). Subjects were presented with
five  practice  strings  of  five  letters  each,  followed  by  four 
strings  each  of  6,  7,  8,  and  9 letters.  The  16  high  similarity 
strings  were  displayed  first  followed  by  16  low  similarity 
strings,  in  contrast  to  Underwood  et  al.  who  presented  low 
similarity  strings  first. 

Underwood  et  al.'s  letter  strings  were  the  stimuli  for 
the second session but were presented in a different order.

Data analysis. For each subject, the proportion of letters
correct was calculated for each string length for both similarity
types. In addition, an estimate of memory capacity was computed
for both similarity types by calculating the average number of
correct letters per string (i.e., number of letters correct,
divided by 16, as there were 16 strings of each type per session).
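
The capacity estimate, and the largest value it can take, follow directly from the design of four strings at each of four lengths. In the sketch below the names and the subject's scores are hypothetical.

```python
STRING_LENGTHS = (6, 7, 8, 9)
STRINGS_PER_LENGTH = 4
SCORED_STRINGS = len(STRING_LENGTHS) * STRINGS_PER_LENGTH   # 16 per similarity type

def capacity_estimate(letters_correct_per_string):
    """Average number of letters correct per string for one similarity type."""
    if len(letters_correct_per_string) != SCORED_STRINGS:
        raise ValueError("expected one score for each of the 16 strings")
    return sum(letters_correct_per_string) / SCORED_STRINGS

# Upper bound: every letter of every string recalled in position.
max_capacity = sum(n * STRINGS_PER_LENGTH for n in STRING_LENGTHS) / SCORED_STRINGS

# Hypothetical subject: letters correct on the 16 low similarity strings.
low_similarity = [6, 5, 6, 5, 6, 6, 5, 5, 6, 5, 6, 5, 5, 6, 5, 6]
low_capacity = capacity_estimate(low_similarity)
print(low_capacity, max_capacity)
```

Because the strings average 7.5 letters, no subject can obtain a capacity estimate above 7.5 on this measure.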

Results.  Figure  6 shows  performance  for  the  various  string 
lengths and letter confusability. As expected, the proportion
correct  decreases  with  increasing  string  length.  Performance 
is  more  accurate  on  Day  2 than  on  Day  1 and  on  low  similarity 
than  on  high  similarity  strings.  Table  6 presents  the  source 
table  for  an  analysis  of  variance  calculated  on  the  data  for 
this  task.  The  effects  of  testing  session  and  letter  similarity 
were  both  significant  (p  < .01)  as  was  the  linear  component  of 
string  length. 

Although the pattern of results in the current experiment
paralleled that obtained by Underwood et al., there were
differences in the magnitude of certain effects in the two studies.
In Underwood et al.'s study, the effect of string length was
pronounced. Subjects' performance dropped from about 85
percent correct on six-letter strings to about 20 percent on
nine-letter strings. In contrast, performance in the current
experiment ranged from 50 to 90 percent at the shortest string
length but dropped only 10 to 20 percent as the number of letters
in a string increased. These performance levels probably reflect
subject differences in the two studies, as was commented upon
previously for the other tasks. Where Underwood et al. obtained
a small acoustic confusion effect, the results of the current
experiment showed large and consistent differences between high
and low similarity letters at every string length and for both
testing sessions. The major procedural modification in the
current experiment was to present the high similarity strings
prior to the low similarity strings in both sessions, just the
reverse of the sequence used by Underwood et al. This means
that any practice effects would probably have enhanced the
effect of letter confusability in the current experiment but
diminished the effect in the Underwood et al. study.

Figure 6. Mean proportion correct as a function of string length on the
Memory Span task. (Legend: □ Low similarity, ■ High similarity;
horizontal axis: String Length.)

Table  7 presents  the  results  in  terms  of  average  number 
of  correct  items  for  high  and  low  similarity  strings  and  for 
the  two  testing  sessions.  The  pattern  of  effects  is  similar 
to  that  shown  in  Figure  6.  Memory  span  improved  by  more  than 
half  a word  from  Day  1 to  Day  2 and  by  more  than  one  and  a 
half  words  from  high  to  low  similarity  strings.  The  increase 
associated  with  testing  session  was  significant  for  both  types 
of  strings.  The  test-retest  correlations  were  .86  and  .82, 
indicating  high  reliability  in  both  instances. 

Memory  capacity  as  measured  in  this  task  was  about  4 
letters  on  high  similarity  strings  and  about  5.5  letters  on 
low  similarity  strings.  Because  the  maximum  value  possible 
was  7.5,  these  figures  are  considerably  lower  than  the  capacity 
estimate  of  7 to  8 words  obtained  in  the  free  recall  task. 

Given  that  memory  span  and  free  recall  provide  measures  of  the 
same  construct,  it  was  expected  that  performance  on  the  two 
tasks  would  be  correlated.  Table  8 shows  the  correlations 
between  average  number  correct  on  memory  span  and  proportion 
correct on free recall. Half of the correlations were
significant, most of them associated with performance on the second
testing session.
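
Whether a correlation of a given size is significant with 22 subjects can be checked by converting r to a t statistic, t = r * sqrt(df / (1 - r^2)). The sketch below uses the criterion given in the note accompanying these correlations (p < .01, df = 20) and assumes a one-tailed test. The critical value of 2.528 comes from a standard t table rather than from the report, and the function name is hypothetical.

```python
from math import sqrt

DF = 20          # 22 subjects minus 2
T_CRIT = 2.528   # one-tailed critical t at p = .01 for df = 20 (standard table value)

def t_from_r(r, df=DF):
    """t statistic for testing a Pearson correlation against zero."""
    return r * sqrt(df / (1.0 - r * r))

# Two of the correlations reported for memory span and free recall.
for r in (0.52, 0.45):
    t = t_from_r(r)
    print(r, round(t, 2), "significant" if t > T_CRIT else "not significant")
```

Under these assumptions a correlation needs to reach roughly .49 to be significant, which is consistent with .52 being starred and .45 not.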

Table 8

Memory Span:  Low Sim., High Sim.
Free Recall:  Control, Concrete, Abstract

.52*/.70*    .42/.63*    .34/.57*    .54*/.54*    .38/.65*    .13/.45

* p < .01, df = 20, one-tailed test.

Correlations Within and Between Tasks

The inter- and intra-task correlations were calculated
for the 16 measures presented in the previous section. In
addition to these 16, the correlations for three other variables
are included in these matrixes. These variables are the d'
and β parameters from the Running Recognition task and the
intercept measure from the Situational Frequency task. Although
these three measures were judged inappropriate for discussion
of group effects, due primarily to potential scoring artifacts
which would contaminate the interpretation of the group data,
they were considered potentially valuable as individual
difference variables. The correlation matrixes for each day
are shown in Tables 9a and 9b. Because of the relatively small
number of subjects, additional factor and regression analyses
were not conducted.

Five  of  the  six  tasks  showed  significant  within-task 
correlations  on  one  or  both  days;  the  exception  was  Interference 
Susceptibility. Although Underwood et al. reported within-task
correlations, differences in response measures in their
study  and  the  current  one  limit  comparisons.  Both  studies, 
however,  used  the  same  measures  in  the  Free  Recall  and  Memory 
Span  tasks,  and  the  magnitude  of  the  correlations  in  both  tasks 
was  similar. 

Several  of  the  inter-task  correlations  were  mentioned  in 
the  previous  section  for  tasks  that  shared  the  same  construct 
and  hence  should  have  been  correlated.  The  pattern  of  data 
in  the  two  matrixes  indicates  that  many  of  the  significant 
correlations  are  between  the  Free  Recall  measures  and  other 
variables, particularly for Day 2 performance. Free Recall,
List Differentiation, Situational Frequency, and Memory Span
were considered to be promising tasks in the current study
because they provided measures of different attributes of
memory. The correlations among the four tasks, however,
suggest that performance may have been dependent more upon general
memory processes than upon the effects of the attributes.

DISCUSSION  AND  CONCLUSIONS 

As  mentioned  above,  the  appropriateness  of  the  tasks  for 
inclusion  in  a test  battery  must  be  evaluated  in  terms  of  three 
criteria: the logistic feasibility of the adaptations, the
replicability  of  previous  findings,  and  the  adequacy  of  the 
tasks  to  provide  measures  of  individual  differences.  Each  of 
these  issues  will  be  addressed  in  turn. 

The adaptations made in the procedures and materials used
by Underwood et al. were successful in the current study. The
subject group used by Underwood et al. differed substantially
from that in the current experiment. Whereas Underwood et al.
tested college students, the AIR sample was more varied in
educational background. Despite these differences, the tasks
were easily and quickly administered and scored, and the
subjects understood what was required of them. All of the tasks
thus met the criterion of logistic feasibility.

Although  subjects  in  the  current  experiment  were  less 
accurate  than  those  in  the  Underwood  et  al.  study,  the  results 
generally  replicated  previous  findings.  The  major  exception 
was  the  Interference  Susceptibility  task  where  some  subjects 
reported  trying  out  different  non-memory-related  strategies 
to  avoid  interference.  For  example,  one  subject  mentioned  that 
he  used  the  initial  letter  of  each  word  to  form  a five-letter 
word  for  a given  trial.  One  indication  of  the  strategy  shift 
was  the  low  reliability  for  the  slope  measure;  another  was  the 
difference in the pattern of correlations among the slope
measure and other variables in the two matrixes. Furthermore,
although  recall  decreased  across  lists  in  a set  as  expected, 
group  performance  was  quite  variable  and  the  effects  were  not 
robust.  Because  of  the  marked  strategy  shift  and  variability 
in  group  performance,  the  Interference  Susceptibility  task  was 
dropped  from  consideration  for  the  battery. 

Judging the replicability of the Running Recognition task
posed special problems since it more closely resembled a
paradigm in the first AIR experiment than the task used by
Underwood et al. The results of the current study indicated
that the task was much easier when the stimuli were words than
when they were numbers. A number of subjects performed the
task with no "false alarm" errors and as a result, several of
the measures were probably affected by a ceiling effect. As
mentioned previously, the signal detection parameters, d' and
β, were omitted from the analysis of group data for this
reason. In addition, the intercept and exponent measures were
probably affected. All four measures had low reliabilities
(d': r = .28, β: r = .20). Although the false alarm measure
was reliable, it was probably affected by a "floor" effect
since performance was frequently errorless. The results did
not support the inclusion of these five measures in the battery.

The  between-task  correlations  were  examined  to  evaluate 
the  adequacy  of  the  tasks  to  provide  measures  of  individual 
differences.  In  making  these  judgments,  the  correlations  were 
examined  to  determine  if  different  variables  were  measuring 
different aspects of performance. If there were several
"redundant" variables, the one least likely to be affected by
differences in subjects' strategies was selected for inclusion in
the battery. While it might be argued that strategies are
critical components of task performance and therefore should
be measured and retained, we consider such strategies to be
inappropriate in the present context unless everyone used the
same approach. If not, the same task would be measuring
different things for different subjects.

Measures  from  several  of  the  tasks  seemed  to  be  redundant 
and  so  some  decision  had  to  be  made  about  which  measure  to 
retain.  For  the  Free  Recall  task,  the  abstract  nouns  condition 
was  selected  since  it  was  less  likely  to  be  affected  by  a 
visualization  strategy.  For  the  Running  Recognition  task,  the 
choice  was  the  "hits"  parameter.  Although  either  the  hits  or 
the  proportion  correct  measure  could  have  been  used,  hits  was 
selected  because  it  was  potentially  less  sensitive  to  guessing 
than  proportion  correct.  For  the  Memory  Span  task,  the  low 
similarity  measure  was  retained  since  it  was  less  susceptible 
to  the  effects  of  different  strategies  than  the  high  similarity 
measure.  The  correlation  measure  from  the  Situational  Frequency 
task  was  selected  for  the  same  reason.  The  List  Differentiation 
measure,  although  partially  interpretable  in  terms  of  a general 
memory  ability,  was  also  retained. 

In summary, five of the six tasks met the criteria for
inclusion in a test battery. All of them appear to be more
related to general skill in encoding and storage than to the
attributes they were supposed to measure. The results parallel
those of Underwood et al., who were unable to demonstrate
individual differences on a variety of attributes. However, the
two experiments had different purposes. The current
experiment achieved its desired outcome in that the results indicated
a set of tasks and measures which provide reliable estimates
of individual differences in general memory skills. These
tasks have been added to those from the previous AIR
experiment as candidate tasks for a test battery.

Introduction


The  research  you  are  about  to  take  part  in  is  one  phase  of  a larger 
project  designed  to  help  understand  basic  human  information  processing 
capacities  and  limitations. 

The  results  of  this  project  will  be  used  to  improve  educational  and 
vocational  guidance  programs.  The  project  will,  for  example,  contribute  to 
the  matching  of  individual  qualifications  and  characteristics  as  needed 
for  specific  jobs  and  to  the  development  of  training  programs  for  various 
occupations  and  professions. 

Your  participation  in  this  project  will  require  attendance  at  two 
sessions  held  on  consecutive  days,  and  consisting  of  approximately  two 
hours  each  session. 

During  each  session  you  will  be  asked  to  complete  a series  of  tasks. 
These  tasks  involve  recognition  and  recall  of  letters  and  words,  and  do 
not  test  your  knowledge  of  general  information,  your  intelligence  or 
your  personality. 

Please do not turn any pages until told to do so. Are there any questions?




Free  Recall 

This task involves recalling words from a list of words you will see.
You will be shown six successive lists of 20 words each. When you see an
asterisk (*) on the screen, this marks the beginning of a list. Each word
will then be flashed on the screen for two seconds. A question mark (?)
means the end of the list. As soon as you see the ?, we want you to turn
to your answer sheet and write down as many of the words as you can remember.
The order in which you write the words is not important but please
list a word only once and try to spell it correctly. You will be allowed
one minute for recall.

From previous experiments we have found that people do much better
at this type of task if they first write down those words that were presented
last. They then continue to recall the other words in the list. We would
appreciate you adopting this strategy.

The first list you will see is a short practice list. This will be
followed by six lists of words with a short break between lists. Are there
any questions?

Running Recognition

This  task  tests  how  well  you  can  recognize  words  that  you  have  seen 
before.  You  will  be  shown  a series  of  words.  Each  word  will  be  presented 
one  at  a time  for  three  seconds.  After  each  word  is  presented,  you  must 
decide  whether  or  not  you  have  seen  it  before  in  the  series. 

Please turn to your answer sheet. If you have not seen the word
before, circle "NEW" on your answer sheet. If you have seen the word before,
circle  "OLD"  on  your  answer  sheet.  So  that  you  can  keep  your  place  on  the 
answer  sheet,  each  word  on  the  screen  has  been  numbered  as  well  as  each 
line  on  your  answer  sheet. 

For  example,  you  may  be  shown  the  following  words,  and  would  answer 
accordingly: 

        shown:              answer:

     1. carpet           1. (NEW)   OLD
     2. table            2. (NEW)   OLD
     3. pencil           3. (NEW)   OLD
     4. carpet           4.  NEW   (OLD)
     5. hello            5. (NEW)   OLD

Please  work  down  your  answer  sheet.  Are  there  any  questions? 



Interference Susceptibility

In this task, we are interested in how well you can remember word and
number pairs. You will be shown a list of five pairs. The pairs will be
shown one at a time, for three seconds each. The first part of the pair will
be a three-letter word, and the second part of the pair will be a number from
one to five (1-5). After the five pairs have been shown, each word will
appear by itself, not necessarily in the same order just shown. You will then
write down the number that was paired with that word.

The same procedure will be used on the next list. This list will
contain the same words as before but they will be paired with a different
combination of numbers. Also, the words will be presented in a different order than
they were seen previously. And again, when each word appears by itself, you
are to respond with the most recent number with which it was paired.

Let's  say  you  are  shown  the  following  list  (one  pair  at  a time): 


POP-2 
INK-1 
TUG-3


Then you would be shown each word by itself for three seconds, not
necessarily in the order in which it was presented. You would write
down the numerical partner that belongs to each word.


shown:    answer:


Next,  the  following  list  would  appear: 


INK-2
POP-3
TUG-1


And again, you would write down the numerical partner for each word.


shown:    answer:

TUG       ____
INK       ____
POP       ____

Are  there  any  questions  so  far? 

Now, please glance at the answer sheet on the following page that you
will be using. Five spaces are provided to record your numbers for every
list. Each word in the list will be presented only once. If you do not
remember the number, either make a guess or leave the space blank, and
proceed to the next word. Please work down your answer sheet.

There  will  be  four  lists  using  the  same  words,  each  time  paired  with 
different  numbers.  As  you  may  notice,  there  will  be  six  sets  of  four  lists 
each;  each  set  will  contain  different  words.  Any  questions? 
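
The "most recent number" rule can be sketched in Python as follows. This is a minimal illustration using the three-pair example from these instructions; the function names are hypothetical and the sketch is not the scoring procedure used in the study.

```python
def most_recent_pairings(lists_so_far):
    """Map each word to the number it was paired with most recently.
    Later lists overwrite earlier ones. Hypothetical helper."""
    latest = {}
    for pairs in lists_so_far:
        for word, number in pairs:
            latest[word] = number
    return latest


def answer_key(lists_so_far, test_order):
    """Return (word, correct number) in the order the test words appear."""
    latest = most_recent_pairings(lists_so_far)
    return [(word, latest[word]) for word in test_order]


# The two example lists from the instructions (three pairs each).
list_1 = [("POP", 2), ("INK", 1), ("TUG", 3)]
list_2 = [("INK", 2), ("POP", 3), ("TUG", 1)]

# After the second list, the words are tested in the order TUG, INK, POP.
for word, number in answer_key([list_1, list_2], ["TUG", "INK", "POP"]):
    print(word, number)
```

Because the second list overwrites the first, the earlier pairings (for example, TUG-3) are no longer correct; recalling them anyway is the kind of interference this task is named for.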
Situational Frequency


In  this  task,  you  will  be  deciding  how  often  a certain  word  appears 
in a list. You will be shown a group of words, one word at a time.
Each word will be shown on the screen for two seconds. After the list
of  words  is  presented,  you  will  be  told  to  turn  to  your  answer  sheet. 
There,  you  will  see  each  word  again.  Next  to  each,  you  must  write  down  the 
number  of  times  that  the  word  appeared  in  the  list.  (Caution:  there  may 
be  words  on  the  answer  sheet  that  did  not  appear  on  the  screen.) 

Here  is  an  example.  Let's  say  you  were  shown  the  following  words: 

judgment 

certain 

paper 

handle 

certain 

On your answer sheet you would write, across from each word, the number of
times it occurred, like so:

Word        No. of times seen:
paper       1
certain     2
judgment    1
olive       0
handle      1

You will be allowed a maximum of four minutes for your answers.

Any  questions? 
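
The counting in this example can be sketched in Python. This is an illustrative sketch only; the function name `frequency_key` is hypothetical, and the word lists are the ones from the example above.

```python
from collections import Counter


def frequency_key(presented, answer_sheet_words):
    """Return the correct count for every word on the answer sheet.
    Words that never appeared on the screen get a count of 0.
    Hypothetical helper for illustration only."""
    counts = Counter(presented)
    return {word: counts[word] for word in answer_sheet_words}


# Words shown on the screen, in order, and the words on the answer sheet.
presented = ["judgment", "certain", "paper", "handle", "certain"]
answer_sheet = ["paper", "certain", "judgment", "olive", "handle"]

for word, count in frequency_key(presented, answer_sheet).items():
    print(f"{word:<10} {count}")
```

The word "olive" is on the answer sheet but was never shown, so its correct count is 0, as the caution in the instructions anticipates.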


List  Differentiation 




In this task you will be shown a series of four-letter words. The
words will be flashed on the screen one at a time for two seconds each.
A total of three lists will be shown, with twenty words in each list. The first
list is List 1, the second is List 2, and the third is List 3. You will
be told and also shown on the screen when one list ends and the next one
begins.

When  you  are  told  to  do  so,  you  will  turn  to  your  answer  sheet.  There 
you will see each of the words again, followed by the numbers 1 2 3. You should
place an "X" over the number of the list in which you think the word occurred.
All of the words on the answer sheet will have appeared on the screen.

Let's  say  you  are  shown  the  following: 


List  1 
hole 
page 

List  2 
stop 
turn 

List  3 
rule 
ring 


On  your  answer  sheet  you  would  mark  the  number  of  the  list  to  which  the  word 
belongs,  like  so: 

Word      List Number

turn      1 X 3
rule      1 2 X
hole      X 2 3
stop      1 X 3
page      X 2 3
ring      1 2 X
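
The marking scheme in this example can be sketched in Python. This is a minimal illustration that assumes each word occurs in exactly one list, as in the example; the function names `list_key` and `mark_row` are hypothetical.

```python
def list_key(lists, answer_sheet_words):
    """Return (word, list number) for each word on the answer sheet.
    Assumes every word occurs in exactly one list. Hypothetical helper."""
    membership = {}
    for list_number, words in lists.items():
        for word in words:
            membership[word] = list_number
    return [(word, membership[word]) for word in answer_sheet_words]


def mark_row(list_number, n_lists=3):
    """Render the row "1 2 3" with an X over the chosen list number."""
    return " ".join("X" if n == list_number else str(n)
                    for n in range(1, n_lists + 1))


# The three example lists and the order of words on the answer sheet.
lists = {1: ["hole", "page"], 2: ["stop", "turn"], 3: ["rule", "ring"]}
answer_sheet = ["turn", "rule", "hole", "stop", "page", "ring"]

for word, list_number in list_key(lists, answer_sheet):
    print(f"{word:<6} {mark_row(list_number)}")
```

Each printed row places the "X" over the list in which the word was shown, in the same way as the marked answer sheet above.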
Memory Span


For this task, you will be shown a series of letters, one letter at a
time. Each letter will be shown for one second. Your job will be to remember
the letter string so you can write down the letters in the same order
in which they were presented.

At  the  start  of  each  trial,  you  will  see  an  asterisk  on  the  screen.  The 
appearance  of  the  asterisk  will  alert  you  that  the  next  letter  string  is 
about  to  appear.  After  a short  delay,  the  asterisk  will  disappear,  and  the 
letter  string  will  be  presented.  At  the  end  of  the  series,  you  will  see  a 
question  mark  on  the  screen.  This  is  the  signal  for  you  to  write  down  the 
letters  you  saw.  Be  sure  to  write  the  letters  in  the  same  order  in  which  they 
were  presented.  You  will  be  allowed  approximately  ten  seconds  for  recall. 

Are there any questions yet? Please turn to the answer sheets
that we will be using. The spaces on the answer sheet mark the places for each
letter  to  be  written.  If  you  cannot  remember  one  or  more  of  the  letters, 
leave  the  appropriate  spaces  blank  to  show  where  you  think  they  were  in  the 
series.  Please  work  from  left  to  right  on  your  answer  sheets. 

There  will  be  two  blocks  of  trials;  one  answer  sheet  is  provided  for 
each  block.  We  will  have  a short  rest  in  between  blocks.  The  first  few 
trials  of  each  block  will  be  for  practice.  Any  questions? 